Abstract

The aim of this paper is to show that Dempster-Shafer
Theory (DST) and a recent theory of plausible and
paradoxical reasoning introduced by Dezert and
Smarandache, and thus called Dezert-Smarandache
Theory (DSmT), can be successfully applied to improve
a supervised classification of remotely sensed data.
Note that the application fields of these two theories are
related to multisensor/multitemporal/multiscale data
fusion. In this study, our contribution lies in developing
a new multispectral data classification process which
can be seen as a multisensor fusion process where each
thematic class is considered as one source of
information.

1. Introduction 

Given the current available techniques, remote 
sensing is recognized as a timely and cost-effective tool 
for earth observation and land monitoring. It constitutes 
the most feasible approach to both land surface change 
detection, and land-cover information required for the 
management of natural resources. The extraction of
land-cover information is usually achieved through
supervised or unsupervised classification methods. In
unsupervised classification, an algorithm such as K-means
or Isodata is chosen that takes a remotely sensed data set
and finds a pre-specified number of statistical clusters in
multispectral space. Although these clusters are not
always equivalent to actual classes of land cover, this
method can be used without prior knowledge of
the ground cover in the study site. Supervised
classification, however, does require prior knowledge of
the ground cover in the study site. The process of gaining
this prior knowledge is known as ground-truthing. With
supervised classification algorithms such as maximum
likelihood or minimum distance, the researcher
locates areas on the unmodified image for which the
type of land cover is known, defines a polygon or a
polyline around the known area, and assigns that land
cover class to the pixels within the polygon or the
polyline. This process, known as the training step, is
continued until a statistically significant number of pixels
exists for each class in the classification scheme. Then,
the multispectral data from the pixels in the sample
polygons are used to train a classification algorithm.
Once trained, the algorithm can be applied to the
entire image and a final classified image is obtained. In
this work we propose a novel supervised classification
approach.

Conventional supervised classifiers are statistical and
very often based on Bayesian theory, which has
proved to be a theoretically robust foundation for satellite
image classification [3] [13]. However, the main
limitation of a Bayesian formalism is that it cannot
represent imprecision about the uncertainty measurement
and can consider only single (or individual) classes,
which may lead to misclassification, especially when
faced with mixed pixels. To overcome these problems,
Dempster-Shafer Theory (DST) [6] [14] and a new theory of
plausible and paradoxical reasoning introduced by
Dezert and Smarandache [7] [15], and thus called
Dezert-Smarandache Theory (DSmT), were used, as they offer an
appropriate mathematical framework for modeling
both imprecision and uncertainty and are able to
consider not only singletons but also compound classes,
such as unions of classes in the DST model and
intersections of classes in the DSmT model.

The remainder of the paper is organised as follows. In
the next section, we recall the mathematical basis of DST
and DSmT and their application to the fusion process.
Section 3 deals with the way DST and DSmT can be
used for multispectral classification of remotely sensed
data. Section 4 shows the results obtained when applying
the proposed classification methodology to real
satellite data acquired by the ETM+ sensor of the Landsat 7
satellite. These results are discussed and compared to a
Bayesian result. Finally, Section 5 gathers our
conclusions.

2. DST and DSmT basis 

The evidence theory developed by Dempster [6] and
better formalized by Shafer [14] makes it possible to
represent both uncertainty and imprecision, and was initially
introduced to fuse conflicting information sources. The
plausible and paradoxical theory [7] [15] is a
generalization of the classical DST which makes it possible
to formally combine any type of information source:
rational, uncertain or paradoxical. Note that the two
theories are based on the definition of an elementary
mass function, from which plausibility and
belief (or credibility) functions are derived.

2.1. DST Basis

2.1.1. Elementary notions

The theory of evidence requires the definition of a frame
of discernment $\Theta$ made up of $k$ exclusive hypotheses
$\theta_i$, $1 \le i \le k$ (the $k$ classes in our case). The
referential $2^\Theta$ is the set of all subsets of $\Theta$.
Plausibility and credibility functions can both be expressed
with a single function, the mass function $m(\cdot)$. Mass,
plausibility and credibility, which are all defined from
$2^\Theta$ onto the interval [0, 1], characterize the likelihood
of any element $A_i$ ($1 \le i \le 2^k$) of $2^\Theta$. The mass
function is defined as:

\[ m : 2^\Theta \rightarrow [0, 1] \tag{1} \]

\[ m(\emptyset) = 0, \qquad \sum_{A_i \in 2^\Theta} m(A_i) = 1, \qquad m(A_i) \ge 0 \;\; \forall A_i \in 2^\Theta \tag{2} \]

where $\emptyset$ represents the empty set.

Focal elements are the elements $A_i$ of $2^\Theta$ for which the
mass $m(A_i)$ is not null.

The plausibility and belief functions are given by:

\[ Pl(A_i) = \sum_{A_j \in 2^\Theta \,/\, A_i \cap A_j \neq \emptyset} m(A_j), \qquad \forall A_i \in 2^\Theta \tag{3} \]

\[ Bel(A_i) = \sum_{A_j \in 2^\Theta \,/\, A_j \subseteq A_i} m(A_j), \qquad \forall A_i \in 2^\Theta \tag{4} \]

The notion of plausibility can be introduced in relation to
the notion of credibility. It is defined as:

\[ Pl(A_i) = 1 - Bel(\bar{A_i}) \tag{5} \]

where $\bar{A_i}$ denotes the complement of $A_i$ in $\Theta$.

The uncertainty about a focal element A is represented by
the interval [Bel(A), Pl(A)], which is called
the “belief interval”; the length of this interval gives a
measurement of the imprecision about the uncertainty
value.

2.1.2. Evidential combination rule of Dempster

Once the evidence functions (mass, plausibility and
belief) associated with each of the $n$ independent information
sources $S_i$ ($1 \le i \le n$) defined on the same frame of
discernment are defined, it is then possible to combine
them according to the DST orthogonal combination rule,
symbolized by the operator $\oplus$. This rule results in:

\[ m(A) = (m_1 \oplus \dots \oplus m_n)(A) = \frac{1}{1 - m_c(\emptyset)} \sum_{A_1 \cap \dots \cap A_n = A} \; \prod_{j=1}^{n} m_j(A_j), \qquad \forall A \in 2^\Theta,\ A \neq \emptyset \tag{6} \]

where $m_c(\emptyset)$ represents the mass assigned to the empty
set and is often interpreted as a measure of conflict
between the $n$ different sources. It is given as follows:

\[ m_c(\emptyset) = \sum_{A_1 \cap \dots \cap A_n = \emptyset} \; \prod_{j=1}^{n} m_j(A_j), \qquad A_j \in 2^\Theta \tag{7} \]

More details about the mathematical properties of
DST can be found in reference [14]. In particular, it is
shown that the DST rule of combination is
commutative and associative, which allows one to
combine the available sources in any order.

2.1.3. Evidential decision rule

After combination of the different sources, a decision
is made according to a given criterion. Several decision
rules have been proposed: 1) maximum of plausibility,
which is judged the best by some authors [3] [4] [11]
[12]; 2) maximum of belief over the simple hypotheses,
which is the most widely used [11]; and 3) maximum of belief
without overlapping of belief intervals, which is very
strict and is called the absolute decision rule [4] [11] [12].

2.2. DSmT Basis

The DSmT of plausible, uncertain and paradoxical
reasoning [7] [8] [9] [15] is a generalization of the
classical DST [6] [14] which makes it possible to formally
combine any type of information source (rational, uncertain
or paradoxical). The DSmT is able to solve complex
data/information fusion problems where the DST usually
fails, especially when conflicts (paradoxes) between
sources become large and when the refinement of the
frame of discernment $\Theta$ is inaccessible because of the
vague, relative and imprecise nature of the elements of
$\Theta$. The foundation of DSmT is the definition of the
hyper-powerset $D^\Theta$ (Dedekind’s lattice) [8] [9] of a
general frame of discernment $\Theta$.

2.2.1. Notion of hyper-powerset $D^\Theta$

The foundation of DSmT is based on the definition of
the hyper-powerset $D^\Theta$ [8] [9]. Let $\Theta$ be a set of
$k$ elements $\theta_i$. The set $2^\Theta$, commonly named the
power set, is the set of all subsets of $\Theta$ when all
$\theta_i$ are disjoint. The hyper-powerset $D^\Theta$ is defined
as the set of all composite propositions built from elements
of $\Theta$ with the $\cup$ and $\cap$ operators such that:

\[ \forall A_i \in D^\Theta,\ \forall A_j \in D^\Theta, \quad (A_i \cup A_j) \in D^\Theta \ \text{and} \ (A_i \cap A_j) \in D^\Theta \tag{8} \]

From a general frame of discernment $\Theta$, a quantity
$m(A)$, called the generalized basic belief mass of $A$, is
defined such that:

\[ m(\emptyset) = 0 \quad \text{and} \quad \sum_{A_i \in D^\Theta} m(A_i) = 1 \tag{9} \]

The plausibility and belief functions are defined in
almost the same manner as within the DST, that is:

\[ Pl(A_i) = \sum_{A_j \in D^\Theta \,/\, A_i \cap A_j \neq \emptyset} m(A_j), \qquad \forall A_i \in D^\Theta \tag{10} \]

\[ Bel(A_i) = \sum_{A_j \in D^\Theta \,/\, A_j \subseteq A_i} m(A_j), \qquad \forall A_i \in D^\Theta \tag{11} \]

These definitions are compatible with the DST
definitions when the sources of information become
uncertain but rational (they do not support paradoxical
information). We still have:

\[ \forall A_i \in D^\Theta, \quad Bel(A_i) \le Pl(A_i) \tag{12} \]

2.2.2. Paradoxical combination rule of Dezert 

Let $Bel_1(\cdot)$ and $Bel_2(\cdot)$ be two belief functions over the
same frame of discernment $\Theta$, with corresponding
generalized basic belief masses $m_1(\cdot)$ and $m_2(\cdot)$ provided by
two distinct but potentially paradoxical sources of
evidence $S_1$ and $S_2$. Then the combined global belief
function $Bel(\cdot) = Bel_1(\cdot) \oplus Bel_2(\cdot)$ associated with the
fusion of the two sources is obtained by
combining the information granules $m_1(\cdot)$ and $m_2(\cdot)$
through Dezert’s rule of combination, given by:

\[ \forall A_i \in D^\Theta, \quad m(A_i) \equiv [m_1 \oplus m_2](A_i) = \sum_{A_j, A_k \in D^\Theta,\; A_j \cap A_k = A_i} m_1(A_j)\, m_2(A_k) \tag{13} \]

For $n$ sources of evidence $S_j$, the generalized form is
given by:

\[ \forall A \in D^\Theta, \quad m(A) \equiv \Big[ \bigoplus_{j=1}^{n} m_j \Big](A) = \sum_{A_1, \dots, A_n \in D^\Theta,\; A_1 \cap \dots \cap A_n = A} \; \prod_{j=1}^{n} m_j(A_j) \tag{14} \]

This rule of combination is commutative and associative
and can always be used for the fusion of paradoxical or
rational sources of information (bodies of evidence).
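
As an illustration of Eq. (13), the following minimal Python sketch combines two generalized mass functions. It rests on assumptions that are not part of the original method description: elements of the hyper-powerset are encoded as sets of conjunctions (a union of intersections of class labels), the free model is used (no intersection is forced to be empty), and the helper names `absorb`, `atom`, `meet` and `dsm_combine` as well as the numerical masses are purely hypothetical.

```python
from itertools import product


def absorb(terms):
    """Keep only minimal conjunctions (absorption law: A u (A n B) = A)."""
    terms = set(terms)
    return frozenset(
        t for t in terms
        if not any(o != t and o.issubset(t) for o in terms)
    )


def atom(name):
    """Encode a single class, e.g. atom("DU"), as a hyper-powerset element."""
    return frozenset({frozenset({name})})


def meet(a, b):
    """Intersection of two elements: distribute n over u, then simplify."""
    return absorb(x | y for x in a for y in b)


def dsm_combine(m1, m2):
    """Eq. (13): m(A_i) = sum of m1(A_j) * m2(A_k) over A_j n A_k = A_i."""
    combined = {}
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        c = meet(a, b)
        combined[c] = combined.get(c, 0.0) + wa * wb
    return combined


# Hypothetical masses for two sources over the classes DU and LDU.
DU, LDU = atom("DU"), atom("LDU")
m1 = {DU: 0.6, LDU: 0.4}
m2 = {DU: 0.3, LDU: 0.7}

m12 = dsm_combine(m1, m2)
paradox = meet(DU, LDU)  # the paradoxical element DU n LDU
print(m12[DU], m12[LDU], m12[paradox])
```

Unlike Dempster's rule, no normalization is applied here: the mass that two sources assign to conflicting classes is kept on their intersection instead of being discarded.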



2.2.3. Paradoxical decision rule 

The decision rule in the framework of the DSmT fusion
process is defined in almost the same manner as within
the DST, that is, by choosing one of the three
criteria mentioned above.

2.3. Definition of DST and DSmT mass 
functions 

The determination of mass functions in DST and
DSmT is a crucial step in a fusion process and
remains a largely unsolved problem for which no general
answer has yet been found. In image processing, Bloch [2]
[3] distinguishes three levels from which a mass
function may be derived: at the highest level,
information representation is used in a way similar to that
in artificial intelligence and masses are assigned to
propositions; at an intermediate level, masses are
computed from attributes and may involve simple
geometrical models; at the pixel level, mass assignment
is inspired by statistical pattern recognition. Recall that
the difficulty increases when we are interested in
compound hypotheses and their mass functions. The most
widely used approach is to assign to simple hypotheses
masses that are computed from conditional probabilities.
A transfer model is then introduced to distribute the
initial masses over all compound hypotheses (unions of
classes in DST and intersections of classes in DSmT).
This transfer operation is done through a coarsening
(discounting) factor and/or a conditioning factor
applied to the conditional probabilities (initial masses).

Several transfer models have been reported in the
literature: the transferable belief model [16], the upper
and lower probability model [6], the parametric model [16],
the consonant model [11] [12], the dissonant model [1], etc.
In this paper, the mass functions are estimated using the
dissonant model of Appriou [1], which was initially
developed for two classes only as follows:



\[ m_i(\theta_i)(x) = \frac{\alpha_i \, R_i \, P(x/\theta_i)}{1 + R_i \, P(x/\theta_i)} \tag{15} \]

\[ m_i(\bar{\theta_i})(x) = \frac{\alpha_i}{1 + R_i \, P(x/\theta_i)} \tag{16} \]

\[ m_i(\Theta)(x) = 1 - \alpha_i \tag{17} \]

where $P(x/\theta_i)$ is the conditional probability, $\alpha_i$ is a
coarsening factor, and $R_i$ represents a normalization
factor that is introduced in the axiomatic approach in
order to respect the mass and plausibility definitions, and
is given by:

\[ R_i = \Big\{ \max_{i} \big[ \sup_{x} P(x/\theta_i) \big] \Big\}^{-1} \tag{18} \]
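
The two-class model of Eqs. (15) to (17) can be sketched in Python as follows. This is only an illustrative sketch: the function name `appriou_masses`, the dictionary keys and the numerical inputs are hypothetical, and the normalization factor is passed in as a plain argument `r` rather than derived from training data.

```python
def appriou_masses(p, alpha, r):
    """Masses of source i for a pixel x, following Eqs. (15)-(17).

    p     -- conditional probability P(x / theta_i)
    alpha -- coarsening factor alpha_i, expected in [0, 1]
    r     -- normalization factor R_i (assumed given, non-negative)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if p < 0.0 or r < 0.0:
        raise ValueError("p and r must be non-negative")
    denom = 1.0 + r * p
    return {
        "theta_i": alpha * r * p / denom,   # Eq. (15): mass on the class
        "not_theta_i": alpha / denom,       # Eq. (16): mass on its complement
        "Theta": 1.0 - alpha,               # Eq. (17): mass on total ignorance
    }


# Hypothetical values, chosen only to show that the masses sum to 1.
masses = appriou_masses(p=0.5, alpha=0.8, r=1.0)
print(masses, sum(masses.values()))
```

The sum of the three masses equals 1 for any admissible input, because the first two terms add up to $\alpha_i$ and the third is $1 - \alpha_i$.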
3. DST and DSmT classification approach

In remote sensing, the first applications of DST and
DSmT were developed in the framework of
multisensor/multitemporal/multiscale data fusion [10]
[11] [12] [15]. In more recent studies, DST and DSmT
have been applied thematically to land use and
land cover mapping, sometimes taking temporal
changes into account [5].

In the present section, we describe the proposed
supervised classification approach based on DST and
DSmT, whose main objective is to improve a Bayesian
classification. The adopted methodology is as follows
[4]:




Figure 1. The proposed DST and DSmT 
classification methodology 



1. According to “a priori” knowledge, two data bases
are constructed: a training base to be used in a supervised
classification process, and a test base to be used during
the assessment of the classification accuracy.

2. A Bayesian classification is performed using a
maximum likelihood algorithm.

3. A confusion matrix is established between the Bayesian
classification result and the test data base.

4. For each class, a coarsening factor is obtained from
the confusion matrix. It can be seen as the accuracy of
that class, which is computed by dividing the total
number of correct pixels in that class by the total
number of pixels in that category as derived from the test
data base.

5. Mass functions of the individual and the compound
classes are estimated through the transfer model of
Appriou, which we have generalized and extrapolated to
more than two classes as follows:



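\[ m_i(\theta_i) = \frac{\alpha_i \, R_i \, P(x/\theta_i)}{1 + R_i \, P(x/\theta_i)} - \frac{(2^\Theta - k - 1)\,\varepsilon}{k} \tag{19} \]

\[ m_i(\theta_1) = m_i(\theta_2) = \dots = m_i(\theta_{i-1}) = m_i(\theta_{i+1}) = \dots = m_i(\theta_k) = \frac{\alpha_i / (k-1)}{1 + R_i \, P(x/\theta_i)} - \frac{(2^\Theta - k - 1)\,\varepsilon}{k} \tag{20} \]

\[ m_i(\theta_1 \cup \theta_2) = m_i(\theta_1 \cup \theta_3) = \dots = m_i(\dots \cup \dots \cup \theta_k) = \varepsilon, \qquad \forall \varepsilon > 0 \tag{21} \]

\[ m_i(\Theta) = 1 - \alpha_i \tag{22} \]

where $k$ is the number of considered classes and $\varepsilon$ is
a sensitivity factor that weights the mass functions so that
their sum over all the hypotheses equals 1.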

6. A combination rule of DST or DSmT is applied. For
each pixel to classify, its mass functions are combined as
follows:

\[ m(A_i) = (m_1 \oplus m_2 \oplus \dots \oplus m_q)(A_i) = \frac{(m_1 \circ m_2 \circ \dots \circ m_q)(A_i)}{1 - m_c(\emptyset)}, \qquad \forall A_i \neq \emptyset \ \text{and} \ A_i \in 2^\Theta \tag{23} \]

where $m_c(\emptyset)$ is the global conflict degree between all
sources (classes in our case) and is computed for each
pixel to be classified.

7. Finally, a multispectral classification is produced
according to a decision rule. We have chosen the
maximum of belief criterion.
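
Steps 6 and 7 can be illustrated, for the DST case with two sources, by the following Python sketch. It is a simplified sketch under stated assumptions rather than the implementation used in this study: hypotheses are encoded as `frozenset` objects of class labels, the function names (`dempster_combine`, `belief`, `plausibility`, `decide_max_belief`) and the numerical masses are hypothetical, and the decision is taken over single classes only, without the conflict threshold used to label mixed zones.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Normalized conjunctive combination of two mass functions.

    Returns the fused masses and the conflict mass m_c(empty set).
    """
    joint, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            joint[inter] = joint.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    fused = {a: w / (1.0 - conflict) for a, w in joint.items()}
    return fused, conflict


def belief(m, a):
    """Sum of the masses of all focal elements included in a."""
    return sum(w for b, w in m.items() if b.issubset(a))


def plausibility(m, a):
    """Sum of the masses of all focal elements intersecting a."""
    return sum(w for b, w in m.items() if b & a)


def decide_max_belief(m, classes):
    """Maximum of belief over the simple hypotheses."""
    return max(classes, key=lambda c: belief(m, frozenset({c})))


# Hypothetical masses for one pixel and two sources.
classes = ("DU", "LDU")
theta = frozenset(classes)
m1 = {frozenset({"DU"}): 0.6, theta: 0.4}
m2 = {frozenset({"LDU"}): 0.5, theta: 0.5}

fused, conflict = dempster_combine(m1, m2)
label = decide_max_belief(fused, classes)
print(conflict, label)
```

Because the rule is associative, more than two sources can be fused by applying `dempster_combine` repeatedly; in that case the overall conflict has to be accumulated separately, since each call only reports the conflict of its own pair.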

4. Results and discussion 

The DST/DSmT classification algorithm described above
was applied to improve the Bayesian classification result,
using a multispectral ETM+ image acquired by the Landsat 7
satellite in June 2001. The RGB composition of the data
set, which covers the north-eastern part of Algiers
(Algeria), and the data bases selected for training and
testing our algorithms are given in Figure 2,
Figure 3, and Figure 4, respectively. Four thematic classes
dominate the study site: Dense Urban (DU), Less Dense Urban
(LDU), Vegetation (V), and Bare Soil (BS).

The Bayesian classification result based on a
maximum likelihood algorithm is shown in Figure 5.
The assessment of this result against the considered
test data gives the confusion matrix of Table 1, which
clearly shows a large confusion between DU and
LDU, and between BS and LDU. The conflict degree
between the considered classes lies in the interval
[0.44, 0.49] and is shown in the image of Figure 6. The DST
and DSmT classification results are given in
Figure 7 and Figure 8, respectively. Table 2 shows the
percentages of the different land cover types present on the
study site, as obtained from the Bayesian, DST, and DSmT
approaches.



0-7803-9521-2/06/$20.00 ©2006 IEEE.


















Figure 2. RGB composition

Figure 3. Training data base

Figure 4. Test data base

Figure 5. Bayesian result



Table 1. Confusion matrix

         S1 (DU)   S2 (LDU)   S3 (V)   S4 (BS)
DU          28        14         0         0
LDU         23        71         0        17
V            0         0       212        13
BS           1        37        13       125



Figure 6. Conflict image 




Figure 7. DST classification
(conflict threshold=0.465)

Legend: DU, LDU, V, BS, DU∪LDU, DU∪V, DU∪BS, LDU∪V, LDU∪BS, V∪BS

Figure 8. DSmT classification
(paradox threshold=0.645)

Legend: DU, LDU, V, BS, DU∩LDU, DU∩V, DU∩BS, LDU∩V, LDU∩BS, V∩BS



Table 2. Percentage of the different land cover
types in the Bayesian, DST, and DSmT approaches

Class      Bayesian          DST               DSmT
           class cover (%)   class cover (%)   class cover (%)
DU              8.59              0.77              2.03
LDU            35.18             32.22             23.15
V              24.24             24.00             31.57
BS             31.99              5.41              4.50
DU∪LDU          0                 8.93              0
DU∪V            0                 0.04              0
DU∪BS           0                 0.08              0
LDU∪V           0                 0.72              0
LDU∪BS          0                26.51              0
V∪BS            0                 1.31              0
DU∩LDU          0                 0                17.36
DU∩V            0                 0                 0.00
DU∩BS           0                 0                 0.00
LDU∩V           0                 0                 4.29
LDU∩BS          0                 0                14.46
V∩BS            0                 0                 2.64



It is known that a Bayesian classification result
often has a "salt-and-pepper" noise appearance due to many
misclassified pixels, especially those located at the
segment borders or extremities of the classes. The
suggested DST and DSmT classifiers aim to improve the
Bayesian land cover map by taking into account the
imprecision and the uncertainty of the acquired data.

The DST classifier leads to a land-cover map made up
of “pure zones” belonging to individual classes (DU, LDU,
V, and BS) and “mixed zones” (or ambiguous zones)
belonging to unions of classes (DU∪LDU, DU∪V,
DU∪BS, LDU∪V, LDU∪BS, and V∪BS). The decision
rule is based on a maximum of belief, together with a
threshold chosen by the user to set the desired
conflict degree. As seen in Table 2, the land cover
types which conflict the most on the site are LDU
and BS; their union represents 26.51% of the site.

The DSmT classifier leads to a land-cover map made up
of “pure zones” belonging to individual classes (DU, LDU,
V, and BS) and “paradoxical zones” (or very conflicting
zones) belonging to intersections of classes (DU∩LDU,
DU∩V, DU∩BS, LDU∩V, LDU∩BS, and V∩BS). The
decision rule is based on a maximum of belief, together
decision rule is based on a maximum of belief according 
with a user-specified threshold that sets the desired
degree of uncertainty on the paradox between the
classes. Note that, unlike the DST classification result,
where the union of classes represents the ignorance
existing between these classes, in the DSmT classification
result the intersection of classes represents new spectral
classes whose spectral response lies between
those of the intersected classes. For example,
DU∩LDU is a thematic class between dense urban and
less dense urban, which is surely an urban zone but
with an average density. Thanks to the DSmT classifier,
finer heterogeneous classes may be detected. As
shown in Table 2, the land cover types which are the
most paradoxical on the site are DU and LDU, whose
intersection represents 17.36% of the site, and LDU and BS,
whose intersection represents 14.46% of the site.
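
As a quick arithmetic check of Table 2, the following Python sketch sums the shares of the pure classes and of the compound classes for the DST and DSmT results. The values are copied from Table 2; the dictionary layout and variable names are only illustrative.

```python
# Percentages from Table 2, grouped into single ("pure") classes and
# compound classes (unions for DST, intersections for DSmT).
table2 = {
    "DST": {
        "pure": [0.77, 32.22, 24.00, 5.41],
        "compound": [8.93, 0.04, 0.08, 0.72, 26.51, 1.31],
    },
    "DSmT": {
        "pure": [2.03, 23.15, 31.57, 4.50],
        "compound": [17.36, 0.00, 0.00, 4.29, 14.46, 2.64],
    },
}

shares = {
    name: {kind: round(sum(values), 2) for kind, values in groups.items()}
    for name, groups in table2.items()
}

for name, share in shares.items():
    print(name, "pure:", share["pure"], "compound:", share["compound"])
```

Both classifiers assign roughly 60% of the site to pure classes and roughly 40% to compound classes.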

5. Conclusion 

Two supervised classifiers of multispectral remotely
sensed data have been presented in this paper. The first
one is based on Dempster-Shafer Theory (DST) and the
second on Dezert-Smarandache Theory (DSmT).
The main purpose of these classifiers is to improve the
result of the Bayesian classification by modeling the
conflicting/paradoxical nature of the considered classes.

The particularity of the proposed methodology is that
each thematic class is treated as one source of
information, or sensor. DST and DSmT have thus been
adapted to this setting in order to design
multispectral classifiers through a multisource fusion
process. The most important step in the framework of the
DST/DSmT fusion process is the definition of the mass
function. In this work we have adopted the transfer model
of Appriou, which we have generalized to more than two
sources of information (thematic classes). The
combination rule of Dempster/Dezert makes it possible to
combine the mass functions of individual and compound
classes, and the most realistic class of each pixel is then
selected using the maximum of belief criterion. In this manner,
the DST classifier attributes unions of classes to the
conflicting pixels, and the DSmT classifier attributes
intersections of classes to the paradoxical (very
conflicting) pixels. The proposed classifiers have
effectively improved the Bayesian classification result:
only about 60% of the land cover has been
confirmed as “a pure zone”, while the remaining 40% has been
detected as “a mixed zone”. As further work, it would
be interesting to examine the spectral behavior of the
compound classes through an analysis of their spectral
signatures.

Finally, this study shows that DST and DSmT
are powerful mathematical tools for designing
multispectral classifiers of satellite data.